Applying Many-to-Many Alignments and Hidden Markov Models to Letter-to-Phoneme Conversion
Authors
Abstract
Letter-to-phoneme conversion generally requires aligned training data of letters and phonemes. Typically, the alignments are limited to one-to-one alignments. We present a novel technique of training with many-to-many alignments. A letter chunking bigram prediction manages double letters and double phonemes automatically as opposed to preprocessing with fixed lists. We also apply an HMM method in conjunction with a local classification model to predict a global phoneme sequence given a word. The many-to-many alignments result in significant improvements over the traditional one-to-one approach. Our system achieves state-of-the-art performance on several languages and data sets.
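To make the contrast with one-to-one alignment concrete, the following minimal sketch enumerates the candidate many-to-many alignments for a single word. It is not the authors' aligner: the function name `many_to_many_alignments`, the chunk limits `max_l` and `max_p`, the absence of null alignments, and the example phoneme symbols are all assumptions chosen for illustration. It only lists the search space that a trained aligner would score.

```python
from typing import List, Sequence, Tuple

# One aligned chunk: (letters in the chunk, phonemes in the chunk).
Chunk = Tuple[Tuple[str, ...], Tuple[str, ...]]


def many_to_many_alignments(
    letters: Sequence[str],
    phonemes: Sequence[str],
    max_l: int = 2,  # assumed cap on letters per chunk
    max_p: int = 2,  # assumed cap on phonemes per chunk
) -> List[List[Chunk]]:
    """Enumerate all monotone alignments built from chunks that pair
    1..max_l letters with 1..max_p phonemes (no null alignments)."""
    if not letters and not phonemes:
        return [[]]  # both sides consumed: one valid (empty) continuation
    results: List[List[Chunk]] = []
    for i in range(1, min(max_l, len(letters)) + 1):
        for j in range(1, min(max_p, len(phonemes)) + 1):
            head = (tuple(letters[:i]), tuple(phonemes[:j]))
            for rest in many_to_many_alignments(
                letters[i:], phonemes[j:], max_l, max_p
            ):
                results.append([head] + rest)
    return results


if __name__ == "__main__":
    word = list("ship")
    pron = ["SH", "IH", "P"]  # hypothetical phoneme symbols
    for alignment in many_to_many_alignments(word, pron):
        print(alignment)
    # With chunks limited to one letter and one phoneme, no alignment exists,
    # because the word has four letters but only three phonemes.
    print(many_to_many_alignments(word, pron, max_l=1, max_p=1))
```

Under these assumptions the word "ship" has five candidate alignments, one of which pairs the double letter "sh" with the single phoneme "SH". A strict one-to-one scheme without null symbols finds none.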
Similar resources
Improving Phoneme Sequence Recognition using Phoneme Duration Information in DNN-HSMM
Improving phoneme recognition has attracted the attention of many researchers due to its applications in various fields of speech processing. Recent research achievements show that using deep neural network (DNN) in speech recognition systems significantly improves the performance of these systems. There are two phases in DNN-based phoneme recognition systems including training and testing. Mos...
Letter-to-Phoneme Conversion for a German Text-to-Speech System
This thesis deals with the conversion from letters to phonemes, syllabification and word stress assignment for a German text-to-speech system. In the first part of the thesis (chapter 5), several alternative approaches for morphological segmentation are analysed and the benefit of such a morphological preprocessing component is evaluated with respect to the grapheme-to-phoneme conversion algori...
Introducing Busy Customer Portfolio Using Hidden Markov Model
Due to the effective role of Markov models in customer relationship management (CRM), there is a lack of comprehensive literature review which contains all related literatures. In this paper the focus is on academic databases to find all the articles that had been published in 2011 and earlier. One hundred articles were identified and reviewed to find direct relevance for applying Markov models...
Deep Bidirectional Long Short-Term Memory Recurrent Neural Networks for Grapheme-to-Phoneme Conversion Utilizing Complex Many-to-Many Alignments
Efficient grapheme-to-phoneme (G2P) conversion models are considered indispensable components to achieve the state-of-the-art performance in modern automatic speech recognition (ASR) and text-to-speech (TTS) systems. The role of these models is to provide such systems with a means to generate accurate pronunciations for unseen words. Recent work in this domain is based on recurrent neural networ...
Hidden Markov models with context-sensitive observations for grapheme-to-phoneme conversion
Hidden Markov models (HMMs) have proven useful in various aspects of speech technology from automatic speech recognition through speech synthesis, speech segmentation and grapheme-to-phoneme conversion to part-of-speech tagging. Traditionally, context is modelled at the hidden states in the form of context-dependent models. This paper constitutes an extension to this approach; the underlying co...
Publication date: 2007